Smart Paper Notes

Deep Reinforcement Learning-Assisted Federated Learning for Robust Short-term Utility Demand Forecasting in Electricity Wholesale Markets

Chenghao Huang , Weilong Chen , Xiaoyi Wang , Feng Hong , Shunji Yang , Yuxi Chen , Shengrong Bu , Changkun Jiang , Yingjie Zhou , Yanru Zhang

Category: Machine Learning

2022-06-23

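Short-term load forecasting (STLF) plays an important role in the operation of electricity trading markets. Given growing concern over data privacy, recent research has increasingly adopted federated learning (FL) to train STLF models for utility companies (UCs). Encouragingly, in wholesale markets, where it is unrealistic for power plants (PPs) to access UCs' data directly, FL is a feasible way to obtain accurate STLF models for PPs. However, because of FL's distributed nature and intense competition among UCs, defects occur increasingly often and degrade the STLF model's performance, which indicates that adopting FL alone is not enough. In this paper, we propose a DRL-assisted FL approach, defect-aware federated soft actor-critic (DearFSAC), to robustly train an accurate STLF model for PPs that forecasts short-term utility demand precisely. First, we design an STLF model based on long short-term memory (LSTM) that uses only historical load data and time data. Furthermore, considering the uncertainty of defect occurrence, a deep reinforcement learning (DRL) algorithm is adopted to assist FL by alleviating the model degradation caused by defects. In addition, for faster convergence of FL training, an auto-encoder is designed for both dimension reduction and quality evaluation of uploaded models. In simulations, we validate our approach on real 2019 data from UCs in Helsinki. The results show that DearFSAC outperforms all other approaches whether or not defects occur.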


Text as Neural Operator: Image Manipulation by Text Instruction

Tianhao Zhang , Hung-Yu Tseng , Lu Jiang , Weilong Yang , Honglak Lee , Irfan Essa

Category: Computer Vision

2020-08-11

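In recent years, text-guided image manipulation has gained increasing attention in the multimedia and computer vision communities. The input to conditional image generation has evolved from image-only to multimodal. In this paper, we study a setting that allows users to edit an image containing multiple objects using complex text instructions to add, remove, or change objects. The input to the task is multimodal, comprising (1) a reference image and (2) a natural-language instruction that describes the desired modifications to the image. We propose a GAN-based method to tackle this problem. The key idea is to treat text as neural operators that locally modify the image features. We show that the proposed model performs favorably against recent strong baselines on three public datasets. Specifically, it generates images of higher fidelity and semantic relevance and, when used as an image query, leads to better retrieval performance.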


FewFedWeight: Few-shot Federated Learning Framework across Multiple NLP Tasks

Weilong Dong , Xinwei Wu , Junzhuo Li , Shuangzhi Wu , Chao Bian , Deyi Xiong

Category: Natural Language Processing

2022-12-16

Massively multi-task learning with large language models has recently made substantial progress on few-shot generalization. However, this is usually performed in a centralized learning fashion, ignoring the privacy sensitivity of the (annotated) data used in multiple tasks. To mitigate this issue, we propose FewFedWeight, a few-shot federated learning framework across multiple tasks, to achieve the best of both worlds: privacy preservation and cross-task generalization. FewFedWeight trains client models on isolated devices without sharing data. It broadcasts the global model on the server to each client and produces pseudo data for clients so that knowledge from the global model can be exploited to enhance the few-shot learning of each client model. An energy-based algorithm is further proposed to weight pseudo samples in order to reduce the negative impact of noise in the generated pseudo data. Adaptive model weights of client models are also tuned according to their performance. We use these model weights to dynamically aggregate client models to update the global model. Experiments on 118 NLP tasks show that FewFedWeight significantly improves the performance of client models on 61% of tasks, with an average performance improvement rate of 30.5% over the baseline, and substantially outperforms FedAvg and other decentralized learning methods.
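
The performance-weighted aggregation step described above can be pictured with a minimal sketch. The abstract does not say how the adaptive model weights are computed, so the softmax over per-client scores used here is an assumption for illustration only, and the names `performance_weights` and `aggregate` are hypothetical rather than taken from the paper.

```python
import math

def performance_weights(scores):
    """Turn per-client performance scores into weights that sum to 1.

    Assumption: a softmax is used here purely for illustration; the
    paper's actual weighting rule is not given in the abstract.
    """
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate(client_params, weights):
    """Weighted average of client parameter vectors (lists of floats)."""
    n = len(client_params[0])
    return [
        sum(w * params[i] for w, params in zip(weights, client_params))
        for i in range(n)
    ]

# Three hypothetical clients, each holding a two-parameter model.
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
scores = [0.2, 0.5, 0.9]  # hypothetical validation scores
weights = performance_weights(scores)
global_params = aggregate(clients, weights)
```

With equal scores this reduces to the uniform average used by plain FedAvg with equally sized clients; better-scoring clients pull the global model toward their parameters.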


Swing Distillation: A Privacy-Preserving Knowledge Distillation Framework

Junzhuo Li , Xinwei Wu , Weilong Dong , Shuangzhi Wu , Chao Bian , Deyi Xiong

Category: Machine Learning | Artificial Intelligence

2022-12-16

Knowledge distillation (KD) has been widely used for model compression and knowledge transfer. Typically, a big teacher model trained on sufficient data transfers knowledge to a small student model. However, despite the success of KD, little effort has been made to study whether KD leaks the training data of the teacher model. In this paper, we experimentally reveal that KD suffers from the risk of privacy leakage. To alleviate this issue, we propose a novel knowledge distillation method, swing distillation, which can effectively protect the private information of the teacher model from flowing to the student model. In our framework, the temperature coefficient is dynamically and adaptively adjusted according to the degree of private information contained in the data, rather than a predefined constant hyperparameter. It assigns different temperatures to tokens according to the likelihood that a token in a position contains private information. In addition, we inject noise into soft targets provided to the student model, in order to avoid unshielded knowledge transfer. Experiments on multiple datasets and tasks demonstrate that the proposed swing distillation can significantly reduce (by over 80% in terms of canary exposure) the risk of privacy leakage in comparison to KD with competitive or better performance. Furthermore, swing distillation is robust against the increasing privacy budget.
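
The two mechanisms named above, a per-token temperature and noise injected into the soft targets, can be sketched as follows. This is not the authors' implementation: the linear map from a privacy score to a temperature, the direction of that map (more private means higher temperature), the Gaussian noise, and all function names and default values are assumptions made for illustration.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Softmax over logits divided by a temperature; higher values flatten the output."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def token_temperature(privacy_score, t_min=1.0, t_max=8.0):
    """Hypothetical linear map from a privacy score in [0, 1] to a temperature."""
    return t_min + (t_max - t_min) * privacy_score

def noisy_soft_target(logits, privacy_score, noise_scale=0.01, rng=None):
    """Temperature-scaled teacher distribution with small Gaussian noise, renormalized."""
    rng = rng or random.Random(0)
    probs = softmax_with_temperature(logits, token_temperature(privacy_score))
    noisy = [max(p + rng.gauss(0.0, noise_scale), 1e-12) for p in probs]
    total = sum(noisy)
    return [p / total for p in noisy]

teacher_logits = [2.0, 1.0, 0.0]
public_target = noisy_soft_target(teacher_logits, privacy_score=0.0)
private_target = noisy_soft_target(teacher_logits, privacy_score=1.0)
```

Under these assumptions, a token judged likely to be private receives a flatter target distribution, so the student sees less of what the teacher specifically predicts at that position.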


Simulating financial time series using attention

Weilong Fu , Ali Hirsa , Jörg Osterrieder

Category: Machine Learning

2022-07-01

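Financial time series simulation is a central topic because it extends the limited real data available for training and evaluating trading strategies. It is also challenging because of the complex statistical properties of real financial data. We introduce two generative adversarial networks (GANs) that use convolutional networks with attention and transformers, respectively, for financial time series simulation. The GANs learn the statistical properties in a data-driven manner, and the attention mechanism helps replicate long-range dependencies. The proposed GANs are tested on S&P 500 index and option data, examined with scores based on the stylized facts, and compared with a pure convolutional GAN, QuantGAN. The attention-based GANs not only reproduce the stylized facts but also smooth the autocorrelation of returns.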
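
The autocorrelation of returns mentioned above is a quantity that stylized-fact checks commonly inspect. The sketch below computes log returns and a sample autocorrelation at a given lag; it illustrates the statistic only and is not the scoring procedure used in the paper. The function names and the price series are hypothetical.

```python
import math

def log_returns(prices):
    """Log returns between consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def autocorrelation(xs, lag):
    """Sample autocorrelation of xs at the given lag (biased estimator)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[t] - mean) * (xs[t + lag] - mean) for t in range(n - lag))
    return cov / var

# Hypothetical price path, for illustration only.
prices = [100.0, 101.0, 100.5, 102.0, 101.0, 103.0, 102.5, 104.0]
returns = log_returns(prices)
acf_returns = [autocorrelation(returns, lag) for lag in range(1, 4)]
acf_abs = [autocorrelation([abs(r) for r in returns], lag) for lag in range(1, 4)]
```

Comparing such curves between real and simulated series is one simple way to see whether a generator reproduces the dependence structure of the data.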


Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information

Zhihao Du , Shiliang Zhang , Siqi Zheng , Weilong Huang , Ming Lei

Category: Machine Learning

2021-11-28

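Overlapping speech diarization is usually treated as a multi-label classification problem. In this paper, we reformulate the task as a single-label prediction problem by encoding the multi-speaker labels with a power set. Specifically, we propose the speaker embedding-aware neural diarization (SEND) method, which predicts the power-set-encoded labels according to the similarities between speech features and given speaker embeddings. Our method is further extended and integrated with downstream tasks by using textual information, which has not been well studied in previous literature. Experimental results show that our method achieves a lower diarization error rate than target-speaker voice activity detection. When textual information is involved, the diarization error can be reduced further. For a real meeting scenario, our method achieves a 34.11% relative improvement over the Bayesian hidden Markov model based clustering algorithm.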
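
The power-set reformulation above can be illustrated with a small sketch: a per-frame vector of speaker activity bits is mapped to one class index, so a classifier predicts a single label per frame instead of several independent ones. The bit ordering and the function names are assumptions for illustration, and any cap the paper places on the number of simultaneous speakers is not modeled here.

```python
def powerset_encode(active):
    """Map a list of 0/1 speaker-activity flags to a single class index.

    Speaker i contributes bit i, so N speakers give 2**N possible classes.
    """
    return sum(bit << i for i, bit in enumerate(active))

def powerset_decode(label, num_speakers):
    """Recover the per-speaker 0/1 activity flags from a class index."""
    return [(label >> i) & 1 for i in range(num_speakers)]

# Speakers 0 and 2 overlap in this frame; speaker 1 is silent.
frame = [1, 0, 1]
label = powerset_encode(frame)       # 5
restored = powerset_decode(label, 3)  # [1, 0, 1]
```

Class 0 then stands for silence, and every overlap pattern gets its own class, which is what turns the multi-label problem into single-label prediction.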


Backdoor Attacks Against Dataset Distillation

Yugeng Liu , Zheng Li , Michael Backes , Yun Shen , Yang Zhang

Category: Machine Learning

2023-01-03

Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.


PMT-IQA: Progressive Multi-task Learning for Blind Image Quality Assessment

Qingyi Pan , Ning Guo , Letu Qingge , Jingyi Zhang , Pei Yang

Category: Computer Vision

2023-01-03

Blind image quality assessment (BIQA) remains challenging due to the diversity of distortion and image content variation, which complicate the distortion patterns across different scales and make the regression problem for BIQA harder. However, existing BIQA methods often fail to consider multi-scale distortion patterns and image content, and little research has been done on learning strategies that make the regression model perform better. In this paper, we propose a simple yet effective Progressive Multi-Task Image Quality Assessment (PMT-IQA) model, which contains a multi-scale feature extraction module (MS) and a progressive multi-task learning module (PMT). These modules help the model learn complex distortion patterns and better optimize the regression problem, following the easy-to-hard progression of human learning. To verify the effectiveness of the proposed PMT-IQA model, we conduct experiments on four widely used public datasets. The experimental results indicate that PMT-IQA outperforms the comparison approaches and that both the MS and PMT modules improve the model's performance.


MGTAB: A Multi-Relational Graph-Based Twitter Account Detection Benchmark

Shuhao Shi , Kai Qiao , Jian Chen , Shuai Yang , Jie Yang , Baojie Song , Linyuan Wang , Bin Yan

Category: Computer Vision

2023-01-03

The development of social media user stance detection and bot detection methods relies heavily on large-scale, high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, which holds back graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain, together with user tweet features, as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when multiple relations are introduced. By analyzing the experimental results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.


KoopmanLab: A PyTorch module of Koopman neural operator family for solving partial differential equations

Wei Xiong , Muyuan Ma , Pei Sun , Yang Tian

Category: Machine Learning

2023-01-03

Given the increasingly intricate forms of partial differential equations (PDEs) in physics and related fields, computationally solving PDEs without analytic solutions inevitably suffers from a trade-off between accuracy and efficiency. Recent advances in neural operators, a kind of mesh-independent neural-network-based PDE solver, suggest that this challenge may be overcome. In this emerging direction, the Koopman neural operator (KNO) is a representative demonstration and outperforms other state-of-the-art alternatives in terms of accuracy and efficiency. Here we present KoopmanLab, a self-contained and user-friendly PyTorch module of the Koopman neural operator family for solving partial differential equations. Beyond the original version of KNO, we develop multiple new variants of KNO based on different neural network architectures to improve the general applicability of our module. These variants are validated by mesh-independent and long-term prediction experiments on representative PDEs (e.g., the Navier-Stokes equation and the Bateman-Burgers equation) and on ERA5 (i.e., one of the largest high-resolution data sets of global-scale climate fields). These demonstrations suggest that KoopmanLab has potential in diverse applications of partial differential equations.
